alaypatel07 (Contributor) commented Sep 29, 2025

What type of PR is this?

/kind feature

What this PR does / why we need it:

This PR adds support for the extended resources feature in the DRA dependency and adds testing manifests for that feature.

Which issue(s) this PR fixes:

Fixes kubernetes/kubernetes#133757

Special notes for your reviewer:

This PR is blocked by the fixes here: kubernetes/kubernetes#134312

@k8s-ci-robot k8s-ci-robot added kind/feature Categorizes issue or PR as related to a new feature. do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. labels Sep 29, 2025
@k8s-ci-robot k8s-ci-robot added the size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. label Sep 29, 2025
@alaypatel07 alaypatel07 force-pushed the dra-extended-resources branch 2 times, most recently from 2f250e1 to 45252c2 Compare September 30, 2025 00:13
@k8s-ci-robot k8s-ci-robot added size/L Denotes a PR that changes 100-499 lines, ignoring generated files. and removed size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. labels Sep 30, 2025
@alaypatel07 alaypatel07 force-pushed the dra-extended-resources branch from 45252c2 to 7cef5a6 Compare September 30, 2025 00:17
@alaypatel07 alaypatel07 changed the title WIP: add support for DRAExtendedResources add support for DRAExtendedResources Sep 30, 2025
@k8s-ci-robot k8s-ci-robot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Sep 30, 2025
mengqiy (Member) commented Oct 23, 2025

Discussed offline. We want to reconcile this with the existing DRA test folder to reduce duplication.

# Make sure a Prometheus stack is deployed
./run-e2e.sh cluster-loader2 \
--provider=kind \
--kubeconfig=/root/.kube/config \
serathius (Contributor) commented Oct 24, 2025

I don't think everyone runs their cluster as the root user.
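A minimal sketch of one way to address this, not taken from the PR: resolve the kubeconfig path from the environment instead of hard-coding `/root/.kube/config`. `KUBECONFIG` is the standard kubectl environment variable; the names `kubeconfig_path` and `cmd` are illustrative, and the flags are only those already shown in the README snippet under review.

```bash
#!/bin/sh
# Resolve the kubeconfig path from the environment rather than assuming root.
# KUBECONFIG may hold a colon-separated list; this sketch uses the first entry.
kubeconfig_path="${KUBECONFIG:-$HOME/.kube/config}"
kubeconfig_path="${kubeconfig_path%%:*}"

# Print the invocation instead of running it, so the sketch has no side effects.
# The flags are copied from the README snippet; nothing else is assumed.
cmd="./run-e2e.sh cluster-loader2 --provider=kind --kubeconfig=${kubeconfig_path}"
echo "${cmd}"
```

In the README itself, the same effect could be had by writing the path as `$HOME/.kube/config`, which works for any user.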

```bash
# Make sure a Prometheus stack is deployed
./run-e2e.sh cluster-loader2 \
--provider=kind \
Review comment (Contributor)

Based on my experience with the load scenario, the default kind cluster configuration does not fulfill some prerequisites (e.g. the metrics endpoints of KCM and the scheduler are not exposed). Have you verified that it works?

- kube-scheduler
- kubelet

2. **DRA Driver**: A DRA driver must be running (installed automatically by the test)
Review comment (Contributor)

If it's installed automatically, then it's not a prerequisite? Same for Prometheus.

--nodes=5
```

## Test Flow
Review comment (Contributor)

Not sure I understand the purpose of the following documentation.

parallelism: {{.Replicas}}
completions: {{.Replicas}}
completionMode: {{.Mode}}
activeDeadlineSeconds: 86400 # 24 hours
Review comment (Contributor)

Why do we need two manifests that differ by only 3 lines?

Timeout: 5m

steps:
- name: Start measurements
Review comment (Contributor)

Why not move monitoring to a submodule?

mborsz (Member) commented Oct 24, 2025

The change looks safe from the CL2 perspective; please address Marek's comments before submitting.

/approve
/hold

@k8s-ci-robot k8s-ci-robot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Oct 24, 2025
k8s-ci-robot (Contributor) commented

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: alaypatel07, mborsz

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Oct 24, 2025
@alaypatel07 alaypatel07 force-pushed the dra-extended-resources branch from 7cef5a6 to db99097 Compare October 24, 2025 18:29
@k8s-ci-robot k8s-ci-robot removed the size/L Denotes a PR that changes 100-499 lines, ignoring generated files. label Oct 24, 2025
@k8s-ci-robot k8s-ci-robot added the size/M Denotes a PR that changes 30-99 lines, ignoring generated files. label Oct 24, 2025
alaypatel07 (Contributor, Author) commented

@mengqiy @serathius I have consolidated the dra and dra-extended-resources tests into one. PTAL.

@serathius can I handle using the measurements as a submodule in a follow-up? It will help me land this within the deadline.

Labels

approved Indicates a PR has been approved by an approver from all required OWNERS files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. kind/feature Categorizes issue or PR as related to a new feature. size/M Denotes a PR that changes 30-99 lines, ignoring generated files.

Development

Successfully merging this pull request may close these issues.

DRA extended resources performance and scalability testing

5 participants